Rater Effects as a Function of Rater Training Context
Abstract
This study examined how rater training and scoring context influence the manifestation of rater effects in a group of trained raters. A total of 120 raters participated in the study, each experiencing one of three training/scoring contexts: (a) online training in a distributed scoring context, (b) online training in a regional scoring context, and (c) stand-up training in a regional context. After training, raters assigned scores on a four-point scale to 400 student essays. Ratings were scaled to a Rasch rating scale model, and several indices were computed to determine the degree to which individual raters showed evidence of severity, inaccuracy, and centrality in the ratings they assigned. The results indicate that latent trait indicators of leniency and inaccuracy are nearly perfectly correlated with raw score indicators of those rater effects. The results also reveal that the expected-residual correlation may be the most direct latent trait indicator of rater centrality. Finally, the results suggest that raters who are trained and score in online distributed environments may be less likely to exhibit centrality and inaccuracy effects.
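As a rough illustration of what raw score indicators of these rater effects can look like, the sketch below compares one rater's scores with benchmark scores on the same essays. It is not the study's procedure: the function names, the use of benchmark scores, and the specific indices (mean difference for leniency/severity, a spread ratio for centrality, a correlation for accuracy, and a raw-score analogue of the expected-residual correlation) are all assumptions chosen for illustration. The study's latent trait indices come from a Rasch rating scale model, which is not fitted here.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def raw_rater_indices(ratings, benchmark):
    """Hypothetical raw-score indices for one rater.

    ratings   -- the rater's scores (e.g. 1-4) on a set of essays
    benchmark -- assumed reference scores for the same essays
    """
    diffs = [r - b for r, b in zip(ratings, benchmark)]
    sd_bench = pstdev(benchmark)
    return {
        # > 0 suggests leniency, < 0 suggests severity
        "mean_diff": mean(diffs),
        # < 1 suggests scores bunched toward the middle (centrality)
        "sd_ratio": pstdev(ratings) / sd_bench if sd_bench else float("nan"),
        # lower values suggest less accurate rank-ordering of essays
        "accuracy_r": pearson(ratings, benchmark),
        # raw analogue of an expected-residual correlation:
        # negative when high essays are under-scored and low essays over-scored
        "resid_r": pearson(benchmark, diffs),
    }


if __name__ == "__main__":
    benchmark = [1, 2, 3, 4, 1, 2, 3, 4]
    central_rater = [2, 2, 3, 3, 2, 2, 3, 3]  # avoids the extreme categories
    print(raw_rater_indices(central_rater, benchmark))
```

In this toy example the central rater has a mean difference of zero (neither lenient nor severe) but a spread ratio below 1 and a negative residual correlation, which is the pattern a centrality index is meant to pick up.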
Similar resources
A Study of Raters’ Behavior in Scoring L2 Speaking Performance: Using Rater Discussion as a Training Tool
The studies conducted so far on the effectiveness of resolution methods including the discussion method in resolving discrepancies in rating have yielded mixed results. What is left unnoticed in the literature is the potential of discussion to be used as a training tool rather than a resolution method. The present study addresses this research gap by exploring the data coming from rating behavi...
Students’ Oral Assessment Considering Various Task Dimensions and Difficulty Factors
This study investigated students’ oral performance ability accounting for various oral analytical factors including fluency, lexical and structural complexity and accuracy with each subcategory. Accordingly, 20 raters scored the oral performances produced by 200 students and a quantitative design using a MANOVA test was used to investigate students’ score differences of various levels of langua...
On Rater Agreement and Rater Training
This paper first analyzes two studies on rater factors and rating criteria to raise the problem of rater agreement. The author then reveals the causes of discrepancies in rating administration by discussing rater variability and rater bias. The author argues that rater bias cannot be eliminated completely; the error can only be reduced to a certain degree by training raters. The study on r...
Fugl-Meyer assessment of sensorimotor function after stroke: standardized training procedure for clinical practice and clinical trials.
BACKGROUND AND PURPOSE Outcome measurement fidelity within and between sites of multi-site, randomized, clinical trials is an essential element to meaningful trial outcomes. As important are the methods developed for randomized, clinical trials that can have practical utility for clinical practice. A standardized measurement method and rater training program were developed for the total Fugl-Me...
Rater reliability and rater effects of the Safe Driving Behavior Measure.
We used the Safe Driving Behavior Measure (SDBM) to determine rater reliability and rater effects (erratic responses, severity, leniency) in three rater groups: 80 older drivers (mean age = 73.26, standard deviation = 5.30), 80 family members or caregivers (age range = 20-85 yr), and two driving evaluators. Rater agreement was significant only between the evaluators and the family members or caregi...